Abstract

We investigate the joint distribution of the vertex degrees in three models of 
random bipartite graphs. Namely, we can choose each edge with a specified prob- 
ability, choose a specified number of edges, or specify the vertex degrees in one of 
the two colour classes. 

This problem can also be described in terms of the row and column sums of a
random binary matrix or the in-degrees and out-degrees of a random digraph, in
which case we can optionally forbid loops. It can also be cast as a problem in
random hypergraphs, or as a classical occupancy, allocation, or coupon collection
problem.

In each case, provided the two colour classes are not too different in size or 
the number of edges too low, we define a probability space based on independent 
binomial variables and show that its probability masses asymptotically equal those 
of the degrees in the graph model almost everywhere. The accuracy is sufficient to 
asymptotically determine the expectation of any joint function of the degrees whose 
maximum is at most polynomially greater than its expectation. 

Our starting points are theorems of Canfield, Greenhill and McKay (2008-2009) 
that enumerate bipartite graphs by degree sequence. The resulting theory is analo- 
gous to that developed by McKay and Wormald (1997) for general graphs. 
1 Introduction



We prefer to use graph terminology, but will first describe the setting in the matrix and
other formulations. Consider a probability space of $m \times n$ matrices over $\{0, 1\}$. Three
probability spaces will be considered. In the first case, which we call $\mathcal{G}_p$, some number
$p \in (0, 1)$ is specified and each entry of the matrix is independently equal to 1 with
probability $p$ and equal to 0 otherwise. In the second case, which we call $\mathcal{G}_k$, some integer
$k$ is specified, and all $m \times n$ binary matrices with exactly $k$ ones have the same probability.
In the third case, which we call $\mathcal{G}_t$, a list of $n$ integers $t_1, \ldots, t_n$ is specified, and all $m \times n$
binary matrices with column sums $t_1, \ldots, t_n$, respectively, are equally likely.

We can interpret the matrix as a bipartite graph in the standard fashion. Associate
distinct vertices $U = \{u_1, \ldots, u_m\}$ with the rows, and $V = \{v_1, \ldots, v_n\}$ with the columns,
and place an edge between $u_i$ and $v_j$ exactly when the matrix entry in position $(i, j)$
equals 1. The row and column sums of the matrix correspond to the degrees of the
vertices.
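As an illustration of the three matrix models just described, the following Python sketch draws one matrix from each and reads off the row and column sums. The function names (`sample_p_model`, `sample_k_model`, `sample_t_model`, `degree_sequence`) are our own labels for this example and do not come from the paper; the samplers are the naive ones implied by the definitions.

```python
import random

def sample_p_model(m, n, p, rng):
    """Each of the m*n entries is 1 independently with probability p."""
    return [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(m)]

def sample_k_model(m, n, k, rng):
    """Uniform over m x n binary matrices with exactly k ones."""
    ones = set(rng.sample(range(m * n), k))  # k distinct cell indices
    return [[1 if i * n + j in ones else 0 for j in range(n)] for i in range(m)]

def sample_t_model(m, t, rng):
    """Uniform over binary matrices with m rows and column sums t."""
    n = len(t)
    matrix = [[0] * n for _ in range(m)]
    for j, tj in enumerate(t):
        # Column j receives tj ones in distinct rows, all placements equally likely.
        for i in rng.sample(range(m), tj):
            matrix[i][j] = 1
    return matrix

def degree_sequence(matrix):
    """Return (s, t): row sums (degrees in U) and column sums (degrees in V)."""
    s = [sum(row) for row in matrix]
    t = [sum(col) for col in zip(*matrix)]
    return s, t

rng = random.Random(0)
s, t = degree_sequence(sample_t_model(5, [2, 0, 3, 1], rng))
print(s, t)  # t is always [2, 0, 3, 1]; s varies with the seed
```

In the third sampler the columns are filled independently, which is exactly the allocation-by-complexes description of the same model.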

These probability models have also appeared in other settings. Given $m$ bins, at each
stage $j = 1, \ldots, n$ throw $t_j$ balls into distinct bins (with all $\binom{m}{t_j}$ possible placings equally
likely). Then the distribution of the number of balls in each bin, $S = (S_1, \ldots, S_m)$, can
be studied. This model is referred to as allocation by complexes and is precisely our $\mathcal{G}_t$
model. If we allow the number of balls thrown to be a random variable $T_j$, binomially
distributed with parameters $(m, p)$, we obtain the $\mathcal{G}_p$ model.

Similarly, in the coupon collection problem a customer repeatedly buys a random
number, $T$, of distinct coupons from a set of $m$ possible different coupons. This covers
both our $\mathcal{G}_p$ case when $T$ is binomially distributed with parameters $(m, p)$ and our $\mathcal{G}_t$
case where $T_j = t_j$ with probability 1. (Here, our vector $s$ describes the number of each
coupon collected and $t$ the number of coupons collected at each stage.)

Finally, consider a hypergraph on $m$ vertices. At each stage $j = 1, \ldots, n$, choose at
random a hyperedge of size $t_j$, allowing multi-edges. Then if we let $S_i$ be the number of
hyperedges which contain the $i$-th vertex, we obtain the $\mathcal{G}_t$ model.

If $m = n$, we can also associate the matrix with a directed graph. There are $n$ vertices
$\{w_1, \ldots, w_n\}$. A matrix entry equal to 1 in position $(i, j)$ corresponds to a directed edge
from $w_i$ to $w_j$. Note that $i = j$ is possible, so these directed graphs can have loops.
The row and column sums of the matrix correspond to the out-degrees and in-degrees,
respectively, of the directed graph. We will also treat the case of loop-free digraphs, which
correspond to square matrices with zero diagonal. Our methods would also work if some
other limited set of matrix entries were required to be zero, but we have not applied them
in that case.
We now continue using the bipartite graph formulation. For each of the three
probability spaces of random bipartite graphs, we seek to examine the $(m+n)$-dimensional
joint distribution of the vertex degrees. If $G$ is a bipartite graph on $U \cup V$ (respecting the
partition into $U$ and $V$), then $s = s(G) = (s_1, \ldots, s_m)$ is the list of degrees of $u_1, \ldots, u_m$,
and $t = t(G) = (t_1, \ldots, t_n)$ is the list of degrees of $v_1, \ldots, v_n$. We call the pair $(s, t)$ the
degree sequence of $G$.

Define $I_n = \{0, 1, \ldots, n\}$ and $I_{m,n} = I_n^m \times I_m^n$. Also let $G(s, t)$ be the number of
bipartite graphs on $U \cup V$ with degree sequence $(s, t)$. In the case of $m = n$, we also
define $\vec{G}(s, t)$ to be the number of digraphs without loops with in-degrees $s$ and out-degrees $t$.

For precision we need to distinguish between random variables (written in uppercase)
and the values they may take (written in lowercase). For each probability space of random
graphs, as determined by the context, $S = (S_1, \ldots, S_m)$ will denote the random variable
given by the degrees in $U$ and $T = (T_1, \ldots, T_n)$ will denote the random variable given by
the degrees in $V$. We will take $S$ to have range $I_n^m$ and $T$ to have range $I_m^n$. Also
define random variables

\[ K = \sum_{i=1}^{m} S_i, \qquad \Lambda = \frac{K}{mn}. \]

As usual, $q$ is an abbreviation for $1 - p$.

Similar results for the degree sequences of ordinary (not necessarily bipartite) graphs
were obtained by McKay and Wormald [27, 28].

1.1 Historical notes 

The $\mathcal{G}_t$ model has received wide-ranging attention, in particular the distribution of the
number of isolated vertices. This is also a natural question in the alternative (non-graph)
wordings of the model. It corresponds to the number of empty bins in the allocation model
[1, 11, 12, 18, 29, 36, 42], the number of uncollected coupons in the collector's problem [2illiT],
the number of isolated vertices in the hypergraph model and the number of zero rows in
the binary matrix model [13]. More generally, the number of vertices with a particular
degree (or range of degrees) in $\mathcal{G}_t$ has been studied in allocation [30, 37, 38], graph [1, 22]
and matrix models [7]. A different extension on this theme is to study the distribution
of the number of draws required to go from $i$ to $j$ non-empty bins [2l|20l[30l[39lll0]. In a
similar direction, Khakimullin and Enatskaya studied the distribution of the number of
draws to exceed a particular lineup in the bins in the $\mathcal{G}_t$ model [17] and in the i.i.d. case
which includes the $\mathcal{G}_p$ model as well [19]. The monograph by Kolchin gives many results
on $\mathcal{G}_t$ phrased as the balls and bins model [21].
We are interested in asymptotic results as $m$ and $n$ tend to infinity while remaining
roughly equal, but another natural option is to fix $m$, the number of vertices in one part,
and let $n$, the number of vertices in the other part, tend to infinity. There seems to be a
consistent divide in the literature: when the model is considered as a graph, the asymptotics of $\mathcal{G}_t$
are studied with $m, n$ both tending to infinity, while the balls and bins and coupon
collection articles (including those cited above) fix $m$ and take $n$ tending to infinity.
This corresponds to fixing the number of bins and taking the number of balls to infinity
or having a fixed number of coupons and letting the number of sampling rounds tend to
infinity.

In the other two probability models on bipartite graphs, $\mathcal{G}_p$ and $\mathcal{G}_k$, two types of results
are known: those on the minimum and maximum degrees [T|[5llM] and those on the number
of vertices with a given degree [22, 31, 32]. For results in the digraph counterpart $\vec{\mathcal{G}}_p$ see
[35] (and below). The model $\mathcal{G}_p$ also appears in papers on ball and bin models. Sometimes
the numbers of balls thrown at each stage are allowed to be i.i.d. random variables [15].
If we then set these random variables to be binomially distributed with parameters $m, p$
we recover the $\mathcal{G}_p$ model. Godbole et al. [H] make a study of the number of sets of
$r$ mutually threatening rooks. This corresponds to the number of vertices with degree $h \ge r$
weighted by $\binom{h}{r}$ in our $\mathcal{G}_p$ and $\mathcal{G}_k$ models.

Of the papers cited, we highlight some which concern the minimum and maximum
degrees, a fixed number of the smallest and largest degrees and the distribution of the $h$-th
largest degree.

Khakimullin determined the asymptotic distribution of the $h$-th largest degree when the
average degree increases faster than $\log m$ [15]. The model used here allowed the numbers
of balls allocated at each step to be independent identically distributed random variables
and so includes both our $\mathcal{G}_p$ and uniform $\mathcal{G}_t$ cases. This extends an earlier result by the
same author which gave the asymptotic distribution of the largest degree [16].

Palka and Sperling showed that if we fix $p$ such that $np = \omega(n) \log n = o(n)$, then
any fixed number of the smallest and largest degrees are unique in $\vec{\mathcal{G}}_p$ and in the uniform
$\vec{\mathcal{G}}_t$ model [35]. A similar result for the $\mathcal{G}_t$ model is shown by Palka in [33], where $t =
(d, d, \ldots, d)$ and $d = \omega(n) \log n = o(n)$. There is also some work on the degrees in random
digraphs by Jaworski and Karonski [13] who showed, in the case that $t = (d, d, \ldots, d)$ and
$d = o(n)$, that the minimum vertex degree in $\vec{\mathcal{G}}_t$ is almost surely the same as that in $\mathcal{G}_t$.

1.2 Asymptotic notation 

As we are dealing with asymptotics of functions of many variables, we must be careful to 
define our asymptotic notations. 
We will tacitly assume that all variables not declared to be constant are functions of
a single underlying index $\ell$ that takes values $1, 2, \ldots$, and that all asymptotic statements
refer to $\ell \to \infty$. Thus, the size parameters $m, n$ are in reality functions $m(\ell)$ and $n(\ell)$,
and a statement like $f(m,n) = O(g(m,n))$ means that there is a constant $A > 0$ such
that $|f(m,n)| \le A\,|g(m,n)|$ when $\ell$ is large enough. This should not be cause for alarm,
because we will invariably impose conditions implying that $m, n \to \infty$ as $\ell \to \infty$.

The expression $\tilde{o}(1)$ represents any function of $\ell$ of magnitude $O(e^{-n^c})$ for some
constant $c > 0$. The constant $c$ might be different for different appearances of the notation.
The class $\tilde{o}(1)$ is closed under addition, multiplication, taking positive powers, and
multiplication by polynomials in $n$.



1.3 Graph models 

We consider three probability spaces of random graphs, and the probability spaces induced
on $I_{m,n}$ by the corresponding random variables $(S, T)$.

1. ($p$-models $\mathcal{G}_p$, $\vec{\mathcal{G}}_p$, for $0 < p < 1$) Generate $G$ by choosing each of the $mn$ possible
edges $u_i v_j$ with probability $p$, such choices being independent. The probability
distribution $\mathcal{G}_p = \mathcal{G}_p(m,n)$ on $I_{m,n}$ is that of the degree sequence $(S, T)$ of $G$. If
$m = n$ and the edges $\{u_i v_i\}$ are forbidden, we obtain the probability distribution
$\vec{\mathcal{G}}_p$ instead. We have

\[ \operatorname{Prob}_{\mathcal{G}_p}(S = s \wedge T = t) = p^{k} q^{mn-k}\, G(s, t), \]

\[ \operatorname{Prob}_{\vec{\mathcal{G}}_p}(S = s \wedge T = t) = p^{k} q^{n^2-n-k}\, \vec{G}(s, t), \]

where $q = 1 - p$ and $k = \sum_{i=1}^{m} s_i$.

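2. ($k$-models $\mathcal{G}_k$, $\vec{\mathcal{G}}_k$, for integer $k$) Generate $G$ by choosing each of the bipartite graphs
on $U \cup V$ having $k$ edges, with equal probability. The probability distribution
$\mathcal{G}_k = \mathcal{G}_k(m, n)$ on $I_{m,n}$ is that of the degree sequence $(S, T)$ of $G$. If $m = n$ and the
edges $\{u_i v_i\}$ are forbidden, we obtain the probability distribution $\vec{\mathcal{G}}_k$ instead. We
have

\[ \operatorname{Prob}_{\mathcal{G}_k}(S = s \wedge T = t) =
\begin{cases}
\binom{mn}{k}^{-1} G(s, t), & \text{if } \sum_{i=1}^{m} s_i = \sum_{j=1}^{n} t_j = k; \\
0, & \text{otherwise,}
\end{cases} \]

\[ \operatorname{Prob}_{\vec{\mathcal{G}}_k}(S = s \wedge T = t) =
\begin{cases}
\binom{n^2-n}{k}^{-1} \vec{G}(s, t), & \text{if } \sum_{i=1}^{n} s_i = \sum_{j=1}^{n} t_j = k; \\
0, & \text{otherwise.}
\end{cases} \]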

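3. ($t$-models $\mathcal{G}_t$, $\vec{\mathcal{G}}_t$, for $t \in I_m^n$) Generate $G$ by choosing each of the bipartite graphs
on $U \cup V$ having $t(G) = t$, with equal probability. Since the random variable $T$ is
fixed at the value $t$ for these graphs, we will define our probability space using $S$
only. The probability distribution $\mathcal{G}_t = \mathcal{G}_t(m)$ on $I_n^m$ is that of the degree sequence
$S$ of $G$ in $U$. If $m = n$ and the edges $\{u_i v_i\}$ are forbidden, we obtain the probability
distribution $\vec{\mathcal{G}}_t$ instead. For a given $t \in I_m^n$ we have

\[ \operatorname{Prob}_{\mathcal{G}_t}(S = s) = \prod_{j=1}^{n} \binom{m}{t_j}^{-1} G(s, t), \]

\[ \operatorname{Prob}_{\vec{\mathcal{G}}_t}(S = s) = \prod_{j=1}^{n} \binom{n-1}{t_j}^{-1} \vec{G}(s, t). \]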

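The probability spaces $\mathcal{G}_p$, $\mathcal{G}_k$ and $\mathcal{G}_t$ are clearly related, by mixing and conditioning.
Note that the first relationships on lines (2) and (3) are independent of $p$ and assume
$0 < p < 1$.

\[ \mathcal{G}_p = \sum_{k=0}^{mn} \binom{mn}{k} p^{k} q^{mn-k}\, \mathcal{G}_k
 = \sum_{t \in I_m^n} \prod_{j=1}^{n} \binom{m}{t_j} p^{t_j} q^{m-t_j}\, \mathcal{G}_t, \tag{1} \]

\[ \mathcal{G}_k = \mathcal{G}_p \big|_{K=k}
 = \sum_{t \in I_m^n :\, \sum_j t_j = k} \binom{mn}{k}^{-1} \prod_{j=1}^{n} \binom{m}{t_j}\, \mathcal{G}_t, \tag{2} \]

\[ \mathcal{G}_t = \mathcal{G}_p \big|_{T=t} = \mathcal{G}_k \big|_{T=t} \quad \Bigl(k = \sum_{j=1}^{n} t_j\Bigr), \tag{3} \]

with similar relations between $\vec{\mathcal{G}}_p$, $\vec{\mathcal{G}}_k$ and $\vec{\mathcal{G}}_t$.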

Note that the separate distributions of $S$ and $T$ in $\mathcal{G}_p$ and $\mathcal{G}_k$ are elementary. In $\mathcal{G}_p$ the
components of $S$ are independent binomial distributions, while in the $\mathcal{G}_k$ model $S$ has a
multivariate hypergeometric distribution. The difficulty is in quantifying the dependence
between $S$ and $T$ when all $m + n$ components are considered together.
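For very small $m$ and $n$ the counts $G(s,t)$ can be obtained by exhaustive enumeration, which gives a direct check of the identity $\operatorname{Prob}(S=s \wedge T=t) = p^k q^{mn-k} G(s,t)$ for the $p$-model. The sketch below is only an illustration under that brute-force assumption (it visits all $2^{mn}$ matrices); the helper names are ours.

```python
from collections import Counter
from itertools import product

def degree_sequence_counts(m, n):
    """Map each degree sequence (s, t) to G(s, t) by enumerating all 2^(mn) matrices."""
    counts = Counter()
    for bits in product((0, 1), repeat=m * n):
        rows = [bits[i * n:(i + 1) * n] for i in range(m)]
        s = tuple(sum(r) for r in rows)          # row sums
        t = tuple(sum(c) for c in zip(*rows))    # column sums
        counts[(s, t)] += 1
    return counts

def prob_p_model(counts, m, n, p):
    """Probability of each degree sequence in the p-model: p^k q^(mn-k) G(s,t)."""
    q = 1 - p
    return {st: p ** sum(st[0]) * q ** (m * n - sum(st[0])) * g
            for st, g in counts.items()}

counts = degree_sequence_counts(2, 3)
probs = prob_p_model(counts, 2, 3, 0.3)
print(counts[((1, 1), (1, 1, 0))])   # number of 2 x 3 matrices with these line sums
print(sum(probs.values()))           # total probability, should be 1 up to rounding
```

Restricting `counts` to sequences with a fixed number $k$ of edges and dividing by $\binom{mn}{k}$ gives the corresponding $k$-model probabilities.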
1.4 Binomial models



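Our aim is to compare the degree sequence distributions defined above to some
distributions derived from independent binomials. Our motivating observation is the known
marginal distributions of $S$ and $T$ in the models $\mathcal{G}_p$ and $\mathcal{G}_k$.

1. (Independent models $\mathcal{I}_p$, $\vec{\mathcal{I}}_p$, for $0 < p < 1$) Generate $m$ components distributed
$\mathrm{Bin}(n, p)$ and $n$ components distributed $\mathrm{Bin}(m, p)$, all $m + n$ components being
independent. The joint distribution on $I_{m,n}$ is $\mathcal{I}_p = \mathcal{I}_p(m,n)$. If instead we have
$m = n$ and the $2n$ components are all distributed $\mathrm{Bin}(n-1, p)$, the joint distribution
on $I_{n,n}$ is $\vec{\mathcal{I}}_p = \vec{\mathcal{I}}_p(n)$. We have

\[ \operatorname{Prob}_{\mathcal{I}_p}(S = s \wedge T = t)
 = p^{\sum_i s_i + \sum_j t_j}\, q^{2mn - \sum_i s_i - \sum_j t_j}
 \prod_{i=1}^{m} \binom{n}{s_i} \prod_{j=1}^{n} \binom{m}{t_j}, \]

\[ \operatorname{Prob}_{\vec{\mathcal{I}}_p}(S = s \wedge T = t)
 = p^{\sum_i s_i + \sum_j t_j}\, q^{2n^2 - 2n - \sum_i s_i - \sum_j t_j}
 \prod_{i=1}^{n} \binom{n-1}{s_i} \prod_{j=1}^{n} \binom{n-1}{t_j}. \]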

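2. (Binomial $p$-models $\mathcal{B}_p$, $\vec{\mathcal{B}}_p$, for $0 < p < 1$) The distribution $\mathcal{B}_p = \mathcal{B}_p(m,n)$ on $I_{m,n}$
is the conditional distribution of $\mathcal{I}_p$ subject to $\sum_{i=1}^{m} S_i = \sum_{j=1}^{n} T_j$. For $m = n$, the
distribution $\vec{\mathcal{B}}_p = \vec{\mathcal{B}}_p(n)$ on $I_{n,n}$ is obtained from $\vec{\mathcal{I}}_p$ by the same conditioning. We
have

\[ \operatorname{Prob}_{\mathcal{B}_p}(S = s \wedge T = t) =
\begin{cases}
\dfrac{\operatorname{Prob}_{\mathcal{I}_p}(S = s \wedge T = t)}{\operatorname{Prob}_{\mathcal{I}_p}\bigl(\sum_{i=1}^{m} S_i = \sum_{j=1}^{n} T_j\bigr)}, & \text{if } \sum_{i=1}^{m} s_i = \sum_{j=1}^{n} t_j; \\[2ex]
0, & \text{otherwise,}
\end{cases} \]

and similarly for $\vec{\mathcal{B}}_p$.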

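3. (Binomial $k$-models $\mathcal{B}_k$, $\vec{\mathcal{B}}_k$, for integer $k$) The distribution $\mathcal{B}_k = \mathcal{B}_k(m,n)$ on $I_{m,n}$
is the conditional distribution of $\mathcal{I}_p$ subject to $\sum_{i=1}^{m} S_i = \sum_{j=1}^{n} T_j = k$. For $m = n$,
$\vec{\mathcal{B}}_k = \vec{\mathcal{B}}_k(n)$ is derived from $\vec{\mathcal{I}}_p$ in the same way. In both cases, the distribution
doesn't depend on $p$. We have

\[ \operatorname{Prob}_{\mathcal{B}_k}(S = s \wedge T = t) =
\begin{cases}
\binom{mn}{k}^{-2} \prod_{i=1}^{m} \binom{n}{s_i} \prod_{j=1}^{n} \binom{m}{t_j}, & \text{if } \sum_{i=1}^{m} s_i = \sum_{j=1}^{n} t_j = k; \\
0, & \text{otherwise,}
\end{cases} \]

\[ \operatorname{Prob}_{\vec{\mathcal{B}}_k}(S = s \wedge T = t) =
\begin{cases}
\binom{n^2-n}{k}^{-2} \prod_{i=1}^{n} \binom{n-1}{s_i} \prod_{j=1}^{n} \binom{n-1}{t_j}, & \text{if } \sum_{i=1}^{n} s_i = \sum_{j=1}^{n} t_j = k; \\
0, & \text{otherwise.}
\end{cases} \]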



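4. (Binomial $t$-models $\mathcal{B}_t$, $\vec{\mathcal{B}}_t$, for $t \in I_m^n$) The distribution $\mathcal{B}_t = \mathcal{B}_t(m, n)$ on $I_n^m$ consists
of the first $m$ components of $\mathcal{B}_k$, where $k = \sum_{j=1}^{n} t_j$. For $m = n$, $\vec{\mathcal{B}}_t = \vec{\mathcal{B}}_t(n)$ is
derived from $\vec{\mathcal{B}}_k$ in the same way. For a given $t \in I_m^n$, we have

\[ \operatorname{Prob}_{\mathcal{B}_t}(S = s) =
\begin{cases}
\binom{mn}{k}^{-1} \prod_{i=1}^{m} \binom{n}{s_i}, & \text{if } \sum_{i=1}^{m} s_i = k; \\
0, & \text{otherwise,}
\end{cases} \]

\[ \operatorname{Prob}_{\vec{\mathcal{B}}_t}(S = s) =
\begin{cases}
\binom{n^2-n}{k}^{-1} \prod_{i=1}^{n} \binom{n-1}{s_i}, & \text{if } \sum_{i=1}^{n} s_i = k; \\
0, & \text{otherwise.}
\end{cases} \]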



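5. (Integrated $p$-models $\mathcal{V}_p$, $\vec{\mathcal{V}}_p$, for $0 < p < 1$) The distribution $\mathcal{V}_p = \mathcal{V}_p(m,n)$ on $I_{m,n}$
is a mixture of $\mathcal{B}_{p'}$ distributions, while for $m = n$ the distribution $\vec{\mathcal{V}}_p = \vec{\mathcal{V}}_p(n)$ on
$I_{n,n}$ is a mixture of $\vec{\mathcal{B}}_{p'}$ distributions. Let

\[ K_p(p') = \Bigl(\frac{mn}{\pi pq}\Bigr)^{1/2} \exp\Bigl(-\frac{mn}{pq}\,(p' - p)^2\Bigr), \qquad
   V(p) = \int_0^1 K_p(p')\, dp'. \]

Then we define

\[ \operatorname{Prob}_{\mathcal{V}_p}(S = s \wedge T = t) = V(p)^{-1} \int_0^1 K_p(p') \operatorname{Prob}_{\mathcal{B}_{p'}}(S = s \wedge T = t)\, dp', \]

\[ \operatorname{Prob}_{\vec{\mathcal{V}}_p}(S = s \wedge T = t) = V(p)^{-1} \int_0^1 K_p(p') \operatorname{Prob}_{\vec{\mathcal{B}}_{p'}}(S = s \wedge T = t)\, dp'. \]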
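To make the binomial $k$-model concrete: conditioning $m$ independent $\mathrm{Bin}(n,p)$ variables and $n$ independent $\mathrm{Bin}(m,p)$ variables on both sums being $k$ removes every power of $p$ and $q$, leaving the product form $\binom{mn}{k}^{-2}\prod_i\binom{n}{s_i}\prod_j\binom{m}{t_j}$. The following sketch, with function names of our own choosing, evaluates that expression and checks by brute force on a tiny case that it is a probability distribution (this is Vandermonde's identity applied to each of the two products).

```python
from itertools import product
from math import comb, prod

def prob_Bk(s, t, m, n, k):
    """Probability of (s, t) when independent binomials are conditioned on both sums equalling k."""
    if sum(s) != k or sum(t) != k:
        return 0.0
    numerator = prod(comb(n, si) for si in s) * prod(comb(m, tj) for tj in t)
    return numerator / comb(m * n, k) ** 2

def total_mass_Bk(m, n, k):
    """Sum prob_Bk over all of I_{m,n}; feasible only for very small m and n."""
    return sum(prob_Bk(s, t, m, n, k)
               for s in product(range(n + 1), repeat=m)
               for t in product(range(m + 1), repeat=n))

print(total_mass_Bk(2, 3, 2))              # should be 1 up to rounding
print(prob_Bk((1, 1), (1, 1, 0), 2, 3, 2)) # 36 / 225
```

Summing `prob_Bk` over $t$ alone leaves $\binom{mn}{k}^{-1}\prod_i\binom{n}{s_i}$, the multivariate hypergeometric marginal of $S$.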
Our main theorems will show that, under certain conditions, $\mathcal{G}_p$ is very close to $\mathcal{V}_p$, $\mathcal{G}_k$
to $\mathcal{B}_k$, and $\mathcal{G}_t$ to $\mathcal{B}_t$. Similar relationships hold for the digraph models. We first record a
few elementary properties.

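Lemma 1. If $\sum_{i=1}^{m} s_i = \sum_{j=1}^{n} t_j = k$ and $pqmn \to \infty$, then

\[ \operatorname{Prob}_{\mathcal{B}_p}(S = s \wedge T = t)
 = \bigl(2 + O((pqmn)^{-1})\bigr)\, p^{2k} q^{2mn-2k} \sqrt{\pi pqmn}\,
 \prod_{i=1}^{m} \binom{n}{s_i} \prod_{j=1}^{n} \binom{m}{t_j}, \]

\[ \operatorname{Prob}_{\vec{\mathcal{B}}_p}(S = s \wedge T = t)
 = \bigl(2 + O((pqn^2)^{-1})\bigr)\, p^{2k} q^{2n^2-2n-2k} \sqrt{\pi pqn(n-1)}\,
 \prod_{i=1}^{n} \binom{n-1}{s_i} \prod_{j=1}^{n} \binom{n-1}{t_j}, \]

uniformly over $s, t$.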

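Proof. In $\mathcal{I}_p$, both $\sum_{i=1}^{m} S_i$ and $\sum_{j=1}^{n} T_j$ have the distribution $\mathrm{Bin}(mn, p)$. Therefore

\[ \operatorname{Prob}_{\mathcal{I}_p}\Bigl(\sum_{i=1}^{m} S_i = \sum_{j=1}^{n} T_j\Bigr)
 = \sum_{k=0}^{mn} \Bigl(\binom{mn}{k} p^{k} q^{mn-k}\Bigr)^{2}
 = \frac{1}{2\sqrt{\pi pqmn}} \bigl(1 + O((pqmn)^{-1})\bigr), \]

where the last line is proved by standard methods. The first claim now follows from the
formulas for $\operatorname{Prob}_{\mathcal{B}_p}(S = s \wedge T = t)$ and $\operatorname{Prob}_{\mathcal{I}_p}(S = s \wedge T = t)$. The second claim is
proved in the same manner. □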
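The estimate behind Lemma 1 is that two independent $\mathrm{Bin}(N,p)$ variables (here $N = mn$) coincide with probability close to $1/(2\sqrt{\pi pqN})$. A quick numerical sanity check, offered only as an illustration with helper names of our own, compares the exact sum of squared binomial masses with that approximation.

```python
from math import comb, pi, sqrt

def prob_sums_agree(N, p):
    """Exact probability that two independent Bin(N, p) variables are equal."""
    q = 1 - p
    return sum((comb(N, k) * p ** k * q ** (N - k)) ** 2 for k in range(N + 1))

def normal_approximation(N, p):
    """Leading-order approximation 1 / (2 sqrt(pi p q N))."""
    return 1 / (2 * sqrt(pi * p * (1 - p) * N))

N, p = 400, 0.3   # e.g. m = n = 20
ratio = prob_sums_agree(N, p) / normal_approximation(N, p)
print(ratio)      # close to 1; the lemma predicts a relative error O(1/(pqN))
```

Floating-point powers are adequate at this size; for much larger $N$ one would work with logarithms to avoid underflow.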

Lemma 2. If $pqmn \to \infty$, then

\[ V(p) = 1 - O(e^{-pqmn}). \]

Proof. $K_p(p')$ is a normal density with mean $p$ and variance $pq/(2mn)$, so we just need
to apply standard normal tail bounds to the definition of $V(p)$. □
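Since $K_p$ is the density of a normal distribution with mean $p$ and variance $pq/(2mn)$, the normalising constant $V(p)$ is the mass that this normal distribution places on $[0,1]$ and can be written with the error function. The sketch below (the function name `V` simply mirrors the notation of the text) evaluates it.

```python
from math import erf, sqrt

def V(p, m, n):
    """Mass on [0, 1] of the normal law with mean p and variance p(1-p)/(2mn)."""
    sigma = sqrt(p * (1 - p) / (2 * m * n))
    scale = sigma * sqrt(2)
    # Phi((1-p)/sigma) - Phi(-p/sigma), expressed through erf.
    return 0.5 * (erf((1 - p) / scale) + erf(p / scale))

print(V(0.5, 1, 1))    # equals erf(1), far from 1 because pqmn is tiny
print(V(0.5, 10, 10))  # indistinguishable from 1 in floating point
```

Already for $m = n = 10$ the deficit $1 - V(p)$ is below double precision, in line with the exponential bound of Lemma 2.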

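The next lemma demonstrates how statistics of variables in $\mathcal{B}_p$ can be converted into
statistics in $\mathcal{V}_p$.

Lemma 3 ([27]). Let $X$ be a random variable on $I_{m,n}$. Then

\[ \mathbb{E}_{\mathcal{V}_p}(X) = V(p)^{-1} \int_0^1 K_p(p')\, \mathbb{E}_{\mathcal{B}_{p'}}(X)\, dp', \]

\[ \operatorname{Var}_{\mathcal{V}_p}(X) = V(p)^{-1} \int_0^1 K_p(p') \Bigl(\operatorname{Var}_{\mathcal{B}_{p'}}(X) + \bigl(\mathbb{E}_{\mathcal{V}_p}(X) - \mathbb{E}_{\mathcal{B}_{p'}}(X)\bigr)^2\Bigr)\, dp'. \]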
1.5 Enumerative background



Consider positive integers $m, n$ and real variable $x \in (0, 1)$. (As mentioned in Section 1.2,
these variables are actually functions of a background index $\ell$.) For constants $a, \varepsilon > 0$,
we say that $(m, n, x)$ is $(a, \varepsilon)$-acceptable if

$m, n \to \infty$ with $m = o(n^{1+\varepsilon})$, $n = o(m^{1+\varepsilon})$, and

\[ \frac{(1 - 2x)^2}{4x(1 - x)} \Bigl(1 + \frac{5m}{6n} + \frac{5n}{6m}\Bigr) \le a \log n. \tag{5} \]

Note that (5) implies $x(1 - x) = \Omega((\log n)^{-1})$.

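For $\varepsilon > 0$, a vector $(x_1, x_2, \ldots, x_N)$ will be called $\varepsilon$-regular if

\[ x_i - \frac{1}{N} \sum_{j=1}^{N} x_j = O(N^{1/2+\varepsilon}) \]

uniformly for $i = 1, \ldots, N$. We say that $(s, t)$ is $\varepsilon$-regular if $\sum_{i=1}^{m} s_i = \sum_{j=1}^{n} t_j$ and $s$, $t$
are both $\varepsilon$-regular.

Finally, define $\lambda_m(t) = (mn)^{-1} \sum_{j=1}^{n} t_j$ and $\lambda_n(s) = (mn)^{-1} \sum_{i=1}^{m} s_i$. If $\sum_{i=1}^{m} s_i = \sum_{j=1}^{n} t_j$, the common value of
$\lambda_n(s)$ and $\lambda_m(t)$ will be denoted $\lambda$. Note that $\lambda$ is the value in $[0, 1]$ that gives the density
of a bipartite graph with degrees $(s, t)$, relative to $K_{m,n}$.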

The bases for our analysis are the following enumerative results of Canfield, Greenhill
and McKay [6, 9]. Also see Barvinok and Hartigan [3] for an overlapping result.

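Theorem 4. Let $a, b > 0$ be constants such that $a + b < \frac12$. Then there is a constant
$\varepsilon_0 = \varepsilon_0(a, b) > 0$ such that the following is true for any fixed $\varepsilon$ with $0 < \varepsilon \le \varepsilon_0$. If $(s, t)$
is $\varepsilon$-regular, then

\[ G(s, t) = \binom{mn}{\lambda mn}^{-1} \prod_{i=1}^{m} \binom{n}{s_i} \prod_{j=1}^{n} \binom{m}{t_j}
 \times \exp\Bigl(-\frac12 \Bigl(1 - \frac{\sum_{i=1}^{m} (s_i - \lambda n)^2}{\lambda(1-\lambda)mn}\Bigr)
 \Bigl(1 - \frac{\sum_{j=1}^{n} (t_j - \lambda m)^2}{\lambda(1-\lambda)mn}\Bigr) + O(n^{-b})\Bigr). \]

Moreover, if $m = n$, then

\[ \vec{G}(s, t) = \binom{n^2-n}{\lambda n^2}^{-1} \prod_{i=1}^{n} \binom{n-1}{s_i} \prod_{j=1}^{n} \binom{n-1}{t_j}
 \times \exp\Bigl(-\frac12 \Bigl(1 - \frac{\sum_{i=1}^{n} (s_i - \lambda n)^2}{\lambda(1-\lambda)n^2}\Bigr)
 \Bigl(1 - \frac{\sum_{j=1}^{n} (t_j - \lambda n)^2}{\lambda(1-\lambda)n^2}\Bigr)
 - \frac{\sum_{i=1}^{n} (s_i - \lambda n)(t_i - \lambda n)}{\lambda(1-\lambda)n^2} + O(n^{-b})\Bigr). \]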
1.6 The main theorems

We now state the theorems that are the main contribution of this paper. Their proofs
will be given in Section 3 after some preliminary lemmas are given in Section 2.

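Theorem 5. Let constants $a, b > 0$ satisfy $a + b < \frac12$. Then there is a constant $\varepsilon =
\varepsilon(a, b) > 0$ such that the following holds. Let $(\mathcal{D}, \mathcal{D}')$ be a pair of probability spaces on
$I_{m,n}$ in one of the following cases.

(a) $(m, n, p)$ is $(a, \varepsilon)$-acceptable and $(\mathcal{D}, \mathcal{D}') = (\mathcal{G}_p, \mathcal{V}_p)$,

(b) $m = n$, $(n, n, p)$ is $(a, \varepsilon)$-acceptable and $(\mathcal{D}, \mathcal{D}') = (\vec{\mathcal{G}}_p, \vec{\mathcal{V}}_p)$,

(c) $(m, n, k/mn)$ is $(a, \varepsilon)$-acceptable and $(\mathcal{D}, \mathcal{D}') = (\mathcal{G}_k, \mathcal{B}_k)$,

(d) $m = n$, $(n, n, k/n^2)$ is $(a, \varepsilon)$-acceptable and $(\mathcal{D}, \mathcal{D}') = (\vec{\mathcal{G}}_k, \vec{\mathcal{B}}_k)$.

Then there is an event $B = B(\mathcal{D}) \subseteq I_{m,n}$ such that $\operatorname{Prob}_{\mathcal{D}}(B) = \tilde{o}(1)$, and uniformly for
$(s, t) \in I_{m,n} \setminus B$,

\[ \operatorname{Prob}_{\mathcal{D}}(S = s \wedge T = t) = (1 + O(n^{-b})) \operatorname{Prob}_{\mathcal{D}'}(S = s \wedge T = t). \]

Moreover, let $X : I_{m,n} \to \mathbb{R}$ be a random variable and let $E \subseteq I_{m,n}$ be an event. Then,

\[ \operatorname{Prob}_{\mathcal{D}}(E) = (1 + O(n^{-b})) \operatorname{Prob}_{\mathcal{D}'}(E) + \tilde{o}(1), \]

\[ \mathbb{E}_{\mathcal{D}}(X) = \mathbb{E}_{\mathcal{D}'}(X) + O(n^{-b})\, \mathbb{E}_{\mathcal{D}'}(|X|) + \tilde{o}(1) \max |X|, \]

\[ \operatorname{Var}_{\mathcal{D}}(X) = (1 + O(n^{-b})) \operatorname{Var}_{\mathcal{D}'}(X) + \tilde{o}(1) \max X^2. \]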

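Corollary 6. Let $E \subseteq I_{m,n}$ be an event. Then, under the conditions of Theorem 5, if
$\operatorname{Prob}_{\mathcal{B}_p}(E) \to 0$ then

\[ \operatorname{Prob}_{\mathcal{G}_p}(E) = \tilde{o}(1) + o(1) \sqrt{\operatorname{Prob}_{\mathcal{B}_p}(E)}, \]

so in particular $\operatorname{Prob}_{\mathcal{G}_p}(E) \to 0$. Similarly, if $m = n$ and $\operatorname{Prob}_{\vec{\mathcal{B}}_p}(E) \to 0$ then

\[ \operatorname{Prob}_{\vec{\mathcal{G}}_p}(E) = \tilde{o}(1) + o(1) \sqrt{\operatorname{Prob}_{\vec{\mathcal{B}}_p}(E)}, \]

so in particular $\operatorname{Prob}_{\vec{\mathcal{G}}_p}(E) \to 0$.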

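Theorem 7. Let constants $a, b > 0$ satisfy $a + b < \frac12$. Then there is a constant $\varepsilon =
\varepsilon(a, b) > 0$ such that the following holds whenever $(m, n, \lambda_m(t))$ is $(a, \varepsilon)$-acceptable and $t$
is $\varepsilon$-regular. Let $(\mathcal{D}, \mathcal{D}')$ be a pair of probability spaces on $I_n^m$ in one of the following cases.

(a) $(\mathcal{D}, \mathcal{D}') = (\mathcal{G}_t, \mathcal{B}_t)$,

(b) $m = n$ and $(\mathcal{D}, \mathcal{D}') = (\vec{\mathcal{G}}_t, \vec{\mathcal{B}}_t)$.

Then there is an event $B = B(\mathcal{D}) \subseteq I_n^m$ such that $\operatorname{Prob}_{\mathcal{D}}(B) = \tilde{o}(1)$, and uniformly for
$s \in I_n^m \setminus B$,

\[ \operatorname{Prob}_{\mathcal{D}}(S = s) = (1 + O(n^{-b})) \operatorname{Prob}_{\mathcal{D}'}(S = s). \]

Moreover, let $X : I_n^m \to \mathbb{R}$ be a random variable and let $E \subseteq I_n^m$ be an event. Then,

\[ \operatorname{Prob}_{\mathcal{D}}(E) = (1 + O(n^{-b})) \operatorname{Prob}_{\mathcal{D}'}(E) + \tilde{o}(1), \]

\[ \mathbb{E}_{\mathcal{D}}(X) = \mathbb{E}_{\mathcal{D}'}(X) + O(n^{-b})\, \mathbb{E}_{\mathcal{D}'}(|X|) + \tilde{o}(1) \max |X|, \]

\[ \operatorname{Var}_{\mathcal{D}}(X) = (1 + O(n^{-b})) \operatorname{Var}_{\mathcal{D}'}(X) + \tilde{o}(1) \max X^2. \]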



2 Properties of likely degree sequences 

Our first task will be to investigate the bulk behaviour of our various probability spaces,
in order to identify some behaviour that has probability $\tilde{o}(1)$. We will apply a few
concentration inequalities, which we now give.

Theorem 8 ([25]). Let $X = (X_1, X_2, \ldots, X_N)$ be a family of independent random
variables, with $X_i$ taking values in a set $A_i$ for each $i$. Suppose that for each $j$ the function
$f : \prod_{i=1}^{N} A_i \to \mathbb{R}$ satisfies $|f(x) - f(x')| \le c_j$ whenever $x, x' \in \prod_{i=1}^{N} A_i$ differ only in the
$j$-th component. Then, for any $z$,

\[ \operatorname{Prob}\bigl(|f(X) - \mathbb{E}(f(X))| \ge z\bigr) \le 2 \exp\Bigl(-2z^2 \Big/ \sum_{i=1}^{N} c_i^2\Bigr). \]

Corollary 9. Let $X = (X_1, X_2, \ldots, X_N)$ be a family of independent real random variables
such that $|X_i - \mathbb{E}(X_i)| \le c_i$ for each $i$. Define $X = \sum_{i=1}^{N} X_i$. Then, for any $z$,

\[ \operatorname{Prob}\bigl(|X - \mathbb{E}(X)| \ge z\bigr) \le 2 \exp\Bigl(-\tfrac12 z^2 \Big/ \sum_{i=1}^{N} c_i^2\Bigr). \]

Another consequence of Theorem 8 is the following.

Theorem 10. Let $A_1, \ldots, A_N$ be finite sets, and let $a_1, \ldots, a_N$ be integers such that $0 \le
a_i \le |A_i|$ for each $i$. Let $\binom{A_i}{a_i}$ denote the uniform probability space of $a_i$-element subsets
of $A_i$. Suppose that for each $j$ the function $f : \prod_{i=1}^{N} \binom{A_i}{a_i} \to \mathbb{R}$ satisfies $|f(x) - f(x')| \le c_j$
whenever $x, x' \in \prod_{i=1}^{N} \binom{A_i}{a_i}$ are the same except that their $j$-th components $x_j, x'_j$ have
$|x_j \cap x'_j| = a_j - 1$ (i.e., the $a_j$-element subsets $x_j, x'_j$ are minimally different). If $X =
(X_1, \ldots, X_N)$ is a family of independent set-valued random variables with distributions
$\binom{A_1}{a_1}, \ldots, \binom{A_N}{a_N}$, then for any $z$,

\[ \operatorname{Prob}\bigl(|f(X) - \mathbb{E}(f(X))| \ge z\bigr) \le 2 \exp\Bigl(\frac{-2z^2}{\sum_{i=1}^{N} c_i^2 \min\{a_i, |A_i| - a_i\}}\Bigr). \]
Proof. We start by reminding the reader of a classical algorithm called "reservoir
sampling", attributed by Knuth to Alan G. Waterman [23, p. 144]. Let $Y^{(i)}_{a_i+1}, \ldots, Y^{(i)}_{|A_i|}$
be independent random variables, where $Y^{(i)}_j$ has the discrete uniform distribution on
$\{1, 2, \ldots, j\}$. Now suppose $A_i = \{w_1, \ldots, w_{|A_i|}\}$. Execute the following algorithm:

For $j = 1, \ldots, a_i$ set $x_j := w_j$;

For $j = a_i + 1, \ldots, |A_i|$, if $Y^{(i)}_j \le a_i$ then set $x_{Y^{(i)}_j} := w_j$.

Define $X_i = X_i(Y^{(i)}_{a_i+1}, \ldots, Y^{(i)}_{|A_i|})$ to be the value of $\{x_1, \ldots, x_{a_i}\}$ when the algorithm
finishes. The raison d'etre of the algorithm, which is easy to check, is that $X_i$ has distribution
$\binom{A_i}{a_i}$; i.e., it is uniform. It is also easy to check that the maximum change to $X_i$ resulting
from a change in a single $Y^{(i)}_j$ is that one element is replaced by another.

Therefore, we can apply Theorem 8 if we consider $f(X)$ as a function of all the
independent variables $\{Y^{(i)}_j\}$. If $a_i < |A_i|/2$, we can represent $X_i$ by its complement; this
justifies the term $\min\{a_i, |A_i| - a_i\}$ in the theorem statement. □
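The reservoir sampling step used in the proof can be sketched directly. The following Python function is an illustrative rendering of the two-phase algorithm above for a single set; the name `reservoir_subset` and the use of Python's `random.Random` as the source of the uniform variables $Y_j$ are our own choices.

```python
import random

def reservoir_subset(items, a, rng):
    """Return a uniformly random a-element subset of items via reservoir sampling."""
    reservoir = list(items[:a])              # x_j := w_j for j = 1, ..., a
    for j in range(a + 1, len(items) + 1):   # j is the 1-based index of w_j
        y = rng.randint(1, j)                # Y_j uniform on {1, ..., j}
        if y <= a:
            reservoir[y - 1] = items[j - 1]  # x_{Y_j} := w_j
    return set(reservoir)

rng = random.Random(0)
print(reservoir_subset(list("abcdef"), 3, rng))  # some 3-element subset
```

Changing one value $Y_j$ alters at most the single slot written at step $j$ (later steps overwrite slots independently of their contents), so the output subset changes in at most one element. That is the bounded-difference property the proof needs.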

We next apply these concentration inequalities to show that certain events are very 
likely in our probability spaces. 

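Theorem 11. The following are true for sufficiently small $\varepsilon > 0$.

(a) Suppose that $(m, n, p)$ and $(m, n, k/mn)$ are $(a, \varepsilon)$-acceptable. Then

\[ \operatorname{Prob}_{\mathcal{D}}\bigl((S, T) \text{ is } \varepsilon\text{-regular}\bigr) = 1 - \tilde{o}(1) \]

for $\mathcal{D}$ being any of $\mathcal{G}_p$, $\mathcal{G}_k$, $\mathcal{I}_p$, $\mathcal{B}_p$, $\mathcal{B}_k$, or $\mathcal{V}_p$. The same is true for $m = n$ when $\mathcal{D}$
is any of $\vec{\mathcal{G}}_p$, $\vec{\mathcal{G}}_k$, $\vec{\mathcal{I}}_p$, $\vec{\mathcal{B}}_p$, $\vec{\mathcal{B}}_k$, or $\vec{\mathcal{V}}_p$.

(b) If $t \in I_m^n$ is $\varepsilon$-regular, and $(m, n, \lambda_m(t))$ is $(a, \varepsilon)$-acceptable, then

\[ \operatorname{Prob}_{\mathcal{D}}(S \text{ is } \varepsilon\text{-regular}) = 1 - \tilde{o}(1) \]

for $\mathcal{D}$ being $\mathcal{G}_t$ or $\mathcal{B}_t$. The same is true for $m = n$ when $\mathcal{D}$ is either of $\vec{\mathcal{G}}_t$ or $\vec{\mathcal{B}}_t$.

Proof. By symmetry, we need only show that $S$ is almost always $\varepsilon$-regular.

In the case that $\mathcal{D}$ is $\mathcal{G}_p$ or $\mathcal{I}_p$, each $S_i$ has the binomial distribution $\mathrm{Bin}(n, p)$, and $K$
has the distribution $\mathrm{Bin}(mn, p)$. Therefore, by Corollary 9,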



Probx)(|5'j — pn\ > n 



I = 




(6) 
from which it follows that

\[ \operatorname{Prob}_{\mathcal{D}}(S \text{ is } \varepsilon\text{-regular}) = 1 - \tilde{o}(1). \]

The cases that $\mathcal{D}$ is $\mathcal{G}_k$, $\mathcal{B}_p$, or $\mathcal{B}_k$ follow, since these are the same as slices of $\mathcal{G}_p$ or $\mathcal{I}_p$ of
size $n^{-O(1)}$, using $p = k/mn$. Also, the distribution of $S$ in $\mathcal{B}_t$ is the same as in $\mathcal{B}_k$ for
$k = \sum_{j=1}^{n} t_j$, so that case follows too.

For $\mathcal{D} = \mathcal{G}_t$, note that each $S_i$ is the sum of independent variables $X_1, \ldots, X_n$, where
$X_j$ is a Bernoulli random variable with mean $t_j/m$. The theorem thus follows using the
same argument as we used for $\mathcal{G}_p$.

Finally consider $\mathcal{D} = \mathcal{V}_p$. Take $X$ to be the indicator of the event that $S$ is not
$\varepsilon$-regular. Lemmas 2 and 3 give

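\[ \mathbb{E}_{\mathcal{V}_p}(X) = O(1) \int_0^1 K_p(p')\, \mathbb{E}_{\mathcal{B}_{p'}}(X)\, dp'
 = O(1) \Bigl( \int_0^{p - n^{-1+\varepsilon}} + \int_{p - n^{-1+\varepsilon}}^{p + n^{-1+\varepsilon}} + \int_{p + n^{-1+\varepsilon}}^{1} \Bigr) K_p(p')\, \mathbb{E}_{\mathcal{B}_{p'}}(X)\, dp'. \]

The first and third integrals are $\tilde{o}(1)$ since the tails of $K_p(p')$ are small (recall that it is a
normal density with mean $p$ and variance $O((mn)^{-1})$), while the second integral is $\tilde{o}(1)$
by the present theorem in the case $\mathcal{D} = \mathcal{B}_{p'}$. (Note that if $(m, n, p)$ is $(a, \varepsilon)$-acceptable,
then all $p' \in [p - n^{-1+\varepsilon}, p + n^{-1+\varepsilon}]$ are $(a', \varepsilon)$-acceptable for slightly different $a'$.)

For the digraph models, the proofs are essentially the same. □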

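Theorem 12. The following are true for sufficiently small $\varepsilon > 0$.

(a) Suppose that $(m, n, p)$ and $(m, n, k/mn)$ are $(a, \varepsilon)$-acceptable. Then

\[ \operatorname{Prob}_{\mathcal{D}}\Bigl(\sum_{i=1}^{m} (S_i - n\Lambda)^2 = \bigl(1 + O(n^{-1/2+2\varepsilon})\bigr) \Lambda(1 - \Lambda)mn\Bigr) = 1 - \tilde{o}(1), \tag{7} \]

\[ \operatorname{Prob}_{\mathcal{D}}\Bigl(\sum_{j=1}^{n} (T_j - m\Lambda)^2 = \bigl(1 + O(m^{-1/2+2\varepsilon})\bigr) \Lambda(1 - \Lambda)mn\Bigr) = 1 - \tilde{o}(1), \tag{8} \]

when $\mathcal{D}$ is $\mathcal{G}_p$ or $\mathcal{G}_k$. When $m = n$, the same bounds hold when $\mathcal{D}$ is $\vec{\mathcal{G}}_p$ or $\vec{\mathcal{G}}_k$.

(b) If $t \in I_m^n$ is $\varepsilon$-regular, and $(m, n, \lambda_m(t))$ is $(a, \varepsilon)$-acceptable, then (7) holds when $\mathcal{D}$
is $\mathcal{G}_t$, and when $m = n$ and $\mathcal{D}$ is $\vec{\mathcal{G}}_t$.

(c) If $m = n$, $(n, n, p)$, $(n, n, k/n^2)$ and $(n, n, \lambda_n(t))$ are $(a, \varepsilon)$-acceptable, and $t \in I_n^n$ is
$\varepsilon$-regular, then

\[ \operatorname{Prob}_{\mathcal{D}}\Bigl(\sum_{i=1}^{n} (S_i - n\Lambda)(t_i - m\Lambda) = O(n^{-1/2+2\varepsilon})\, \Lambda(1 - \Lambda)n^2\Bigr) = 1 - \tilde{o}(1) \]

when $\mathcal{D}$ is $\vec{\mathcal{G}}_p$, $\vec{\mathcal{G}}_k$ or $\vec{\mathcal{G}}_t$.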

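Proof. Write $R = \sum_{i=1}^{m} (S_i - n\Lambda)^2$. For $i = 1, \ldots, m$ and $j = 1, \ldots, n$, let $X_{ij}$ be the
indicator for an edge from $u_i$ to $v_j$. Then some rearrangement of terms yields

\[ R = \frac{1}{2m} \sum_{i,i'=1}^{m} \sum_{j,j'=1}^{n} \Delta_{ii'jj'}, \]

where $\Delta_{ii'jj'} = (X_{ij} - X_{i'j})(X_{ij'} - X_{i'j'})$. When $\mathcal{D}$ is either $\mathcal{G}_p$ or $\mathcal{G}_t$, $X_{ij}$ is independent
of $X_{i'j'}$ if $j \ne j'$, and $\mathbb{E}_{\mathcal{D}}(X_{ij})$ is independent of $i$. This shows that $\mathbb{E}_{\mathcal{D}}(\Delta_{ii'jj'}) = 0$ for
$j \ne j'$, leaving us with

\[ \mathbb{E}_{\mathcal{D}}(R) = \frac{1}{2m} \sum_{i,i'=1}^{m} \sum_{j=1}^{n} \mathbb{E}_{\mathcal{D}}(\Delta_{ii'jj}). \]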
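The rearrangement used at the start of this proof is the elementary identity $\sum_i (S_i - \bar{S})^2 = \frac{1}{2m}\sum_{i,i'}(S_i - S_{i'})^2$, with each $(S_i - S_{i'})^2$ expanded over pairs of columns. It can be checked numerically on any binary matrix. The sketch below assumes $\Lambda = K/(mn)$, so that $n\Lambda$ is the mean row sum; the function names are ours.

```python
import random

def R_direct(X):
    """Sum over rows of (S_i - n*Lambda)^2, where n*Lambda is the mean row sum."""
    m = len(X)
    S = [sum(row) for row in X]
    mean = sum(S) / m
    return sum((si - mean) ** 2 for si in S)

def R_pairs(X):
    """(1/2m) * sum over i, i', j, j' of (X_ij - X_i'j)(X_ij' - X_i'j')."""
    m, n = len(X), len(X[0])
    total = 0
    for i in range(m):
        for i2 in range(m):
            for j in range(n):
                for j2 in range(n):
                    total += (X[i][j] - X[i2][j]) * (X[i][j2] - X[i2][j2])
    return total / (2 * m)

rng = random.Random(2)
X = [[rng.randint(0, 1) for _ in range(5)] for _ in range(4)]
print(R_direct(X), R_pairs(X))  # the two values agree up to rounding
```

The quadruple loop is only for transparency; it costs $O(m^2 n^2)$ operations and is meant for small test matrices.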



This gives 



1 " 



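Now define $R^* = \sum_{i=1}^{m} \min\{(S_i - n\Lambda)^2, m^{1+2\varepsilon}\}$. If $S_j$ is changed by 1 for some $j$, which
changes $\Lambda$ by $1/mn$, then $\min\{(S_i - n\Lambda)^2, m^{1+2\varepsilon}\}$ changes by $O(m^{1/2+\varepsilon})$ for $i = j$ and by
$O(m^{-1/2+\varepsilon})$ for $i \ne j$. Consequently, $R^*$ changes by $O(m^{1/2+\varepsilon})$. Applying Theorem 8 we
find that

\[ \operatorname{Prob}_{\mathcal{D}}\bigl(|R^* - \mathbb{E}_{\mathcal{D}}(R^*)| > \tfrac12 m^{1+\varepsilon} n^{1/2+\varepsilon/2}\bigr) = \tilde{o}(1) \]

for $\mathcal{D} = \mathcal{G}_p$. It also holds for $\mathcal{D} = \mathcal{G}_t$, using Theorem 10 in the same way.

Now Theorem 11 shows that $\operatorname{Prob}_{\mathcal{D}}(R \ne R^*) = \tilde{o}(1)$, which implies that $\mathbb{E}_{\mathcal{D}}(R^*) =
\mathbb{E}_{\mathcal{D}}(R) + \tilde{o}(1)$. Therefore we can argue

\begin{align*}
\operatorname{Prob}_{\mathcal{D}}\bigl(|R - \mathbb{E}_{\mathcal{D}}(R)| > m^{1+\varepsilon} n^{1/2+\varepsilon/2}\bigr)
 &\le \operatorname{Prob}_{\mathcal{D}}(R \ne R^*) + \operatorname{Prob}_{\mathcal{D}}\bigl(|R^* - \mathbb{E}_{\mathcal{D}}(R)| > m^{1+\varepsilon} n^{1/2+\varepsilon/2}\bigr) \\
 &\le \tilde{o}(1) + \operatorname{Prob}_{\mathcal{D}}\bigl(|R^* - \mathbb{E}_{\mathcal{D}}(R^*)| > \tfrac12 m^{1+\varepsilon} n^{1/2+\varepsilon/2}\bigr) \\
 &= \tilde{o}(1).
\end{align*}

We also have that $\Lambda$ is fixed at the value $\lambda_m(t) = (mn)^{-1} \sum_{j=1}^{n} t_j$ in $\mathcal{G}_t$ and that

\[ \operatorname{Prob}_{\mathcal{G}_p}\bigl(|\Lambda - p| > n^{-1+2\varepsilon}\bigr) = \tilde{o}(1) \]

by (6). From these bounds, inequality (7) follows for $\mathcal{G}_p$ and $\mathcal{G}_t$, and (8) follows for $\mathcal{G}_p$ by
symmetry. By choosing $p = k/mn$ and noting that $\mathcal{G}_k$ is a slice of size $n^{-O(1)}$ of $\mathcal{G}_p$, the
theorem is proved for $\mathcal{G}_k$ too.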
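For $\mathcal{D} = \vec{\mathcal{G}}_p, \vec{\mathcal{G}}_k, \vec{\mathcal{G}}_t$, the proofs of (7) and (8) follow the same pattern. Since the
rearrangement of $R$ in terms of the $\Delta_{ii'jj'}$ still holds, we can note that
$\mathbb{E}_{\vec{\mathcal{G}}_p}(\Delta_{ii'jj'}) = \mathbb{E}_{\mathcal{G}_p}(\Delta_{ii'jj'})$ and $\mathbb{E}_{\vec{\mathcal{G}}_t}(\Delta_{ii'jj'}) = \mathbb{E}_{\mathcal{G}_t}(\Delta_{ii'jj'})$ unless
$\{j, j'\} \cap \{i, i'\} \ne \emptyset$ to infer that $\mathbb{E}_{\vec{\mathcal{G}}_p}(R) = \mathbb{E}_{\mathcal{G}_p}(R) + O(n)$ and $\mathbb{E}_{\vec{\mathcal{G}}_t}(R) = \mathbb{E}_{\mathcal{G}_t}(R) + O(n)$. This
is enough to ensure that the rest of the proof continues in the same way. (For the record,
$\mathbb{E}_{\vec{\mathcal{G}}_p}(R) = pq(n-1)^2$.)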

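We now prove part (c). Take $\mathcal{D} = \vec{\mathcal{G}}_t$ first, with $t$ being $\varepsilon$-regular and $(n, n, \lambda_n(t))$
being $(a, \varepsilon)$-acceptable. We have

\[ \mathbb{E}_{\mathcal{D}}(S_i) = \sum_{j \ne i} \frac{t_j}{n-1} = \frac{\sum_{j=1}^{n} t_j - t_i}{n-1}, \]

from which it follows that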

n 

1=1 

Now we can apply Theorem 10 to conclude that (c) holds. In the case of $\mathcal{D} = \vec{\mathcal{G}}_p$,
Theorem 11 says that $T$ is $\varepsilon$-regular with probability $1 - \tilde{o}(1)$, so (c) holds in that case
too. Finally, $\vec{\mathcal{G}}_k$ is a substantial slice of $\vec{\mathcal{G}}_p$ if $p = k/n^2$, so (c) holds for $\vec{\mathcal{G}}_k$ as well. □



n — 1 



3 Proofs of the main theorems 

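In this section we will prove the theorems and corollaries in Section 1.6.

We first consider $\mathcal{G}_p$. Suppose that $a, b > 0$ are constants with $a + b < \frac12$, and that
$(m, n, p)$ is $(a, \varepsilon)$-acceptable. According to Theorems 4, 11 and 12 and (6), there is an
event $B \subseteq I_{m,n}$ such that $\operatorname{Prob}_{\mathcal{G}_p}(B) = \tilde{o}(1)$ and, for $(s, t) \notin B$,

\[ |k - pmn| \le m n^{2\varepsilon}, \tag{10} \]

\begin{align*}
\operatorname{Prob}_{\mathcal{G}_p}(S = s \wedge T = t)
 &= p^{k} q^{mn-k} \exp\bigl(O(n^{-b})\bigr) \binom{mn}{k}^{-1} \prod_{i=1}^{m} \binom{n}{s_i} \prod_{j=1}^{n} \binom{m}{t_j} \\
 &= p^{2k} q^{2mn-2k} \sqrt{2\pi pqmn}\, \prod_{i=1}^{m} \binom{n}{s_i} \prod_{j=1}^{n} \binom{m}{t_j}
 \times \exp\Bigl(\frac{(k - pmn)^2}{2pqmn} + O(n^{-b})\Bigr) \tag{11}
\end{align*}

for $k = \sum_{i=1}^{m} s_i = \sum_{j=1}^{n} t_j$, where the last step follows by Stirling's formula and, as always,
we are assuming that $\varepsilon$ is sufficiently small.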
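We wish to show that (11) closely matches the probability in $\mathcal{V}_p$. Define $P(p, s, t) =
\operatorname{Prob}_{\mathcal{B}_p}(S = s \wedge T = t)$. By the definition of $\mathcal{V}_p$, we have

\[ \operatorname{Prob}_{\mathcal{V}_p}(S = s \wedge T = t) = V(p)^{-1} \int_0^1 K_p(p')\, P(p', s, t)\, dp'. \]

By Section 1.4 item 2, we have

\[ \frac{P(p', s, t)}{P(p, s, t)}
 = \frac{\operatorname{Prob}_{\mathcal{I}_p}\bigl(\sum_{i=1}^{m} S_i = \sum_{j=1}^{n} T_j\bigr)}{\operatorname{Prob}_{\mathcal{I}_{p'}}\bigl(\sum_{i=1}^{m} S_i = \sum_{j=1}^{n} T_j\bigr)}
 \Bigl(\frac{p'}{p}\Bigr)^{2k} \Bigl(\frac{1 - p'}{1 - p}\Bigr)^{2mn-2k}. \tag{12} \]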

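We will divide the integral into three parts. Define $J_p = [p - n^{-1+3\varepsilon}, p + n^{-1+3\varepsilon}]$. By
Lemma 1 and (10), for $p' \in J_p$ and $(s, t) \notin B$, we have

\[ \frac{P(p', s, t)}{P(p, s, t)} = \exp\Bigl(\frac{2(k - pmn)}{pq}\,(p' - p) - \frac{mn}{pq}\,(p' - p)^2 + O(n^{-1/2})\Bigr), \tag{13} \]

which gives

\[ \int_{J_p} K_p(p')\, P(p', s, t)\, dp' = 2^{-1/2} P(p, s, t) \exp\Bigl(\frac{(k - pmn)^2}{2pqmn} + O(n^{-1/2})\Bigr). \]

To bound the integral outside $J_p$, note that $(p'/p)^{2k} \bigl((1 - p')/(1 - p)\bigr)^{2mn-2k}$ is increasing
for $p' < p - n^{-1+3\varepsilon}$ and decreasing for $p' > p + n^{-1+3\varepsilon}$. Also, since the mean square of a
set of numbers is at least as large as the square of their mean, we can infer from the sum of
squares in the proof of Lemma 1 that
$\operatorname{Prob}_{\mathcal{I}_{p'}}\bigl(\sum_{i=1}^{m} S_i = \sum_{j=1}^{n} T_j\bigr) \ge (mn + 1)^{-1}$ for all $p'$. Since $mn\,\tilde{o}(1) = \tilde{o}(1)$, we obtain
from (12) that

\[ \int_{[0,1] \setminus J_p} K_p(p')\, P(p', s, t)\, dp' = \tilde{o}(1)\, P(p, s, t). \]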
Recalling Lemma 2, we conclude that

\[ V(p)^{-1} \int_0^1 K_p(p')\, P(p', s, t)\, dp'
 = 2^{-1/2} P(p, s, t) \exp\Bigl(\frac{(k - pmn)^2}{2pqmn} + O(n^{-1/2})\Bigr), \]

which matches (11) when the value of $P(p, s, t)$ given by Lemma 1 is substituted. This
completes the proof of the first claim of Theorem 5(a). The next two claims follow on
summing the first claim over all $(s, t)$. For the variance, we can apply the formula for the
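expectation to argue

\begin{align*}
\operatorname{Var}_{\mathcal{G}_p}(X) &= \min_{\mu \in \mathbb{R}} \mathbb{E}_{\mathcal{G}_p}(X - \mu)^2 \\
 &= \min_{\mu \in \mathbb{R}} \Bigl(\tilde{o}(1) \max_{(s,t)} (X - \mu)^2 + (1 + O(n^{-b}))\, \mathbb{E}_{\mathcal{V}_p}(X - \mu)^2\Bigr) \\
 &= \min_{\mu \in \mathbb{R}} \Bigl(\tilde{o}(1) \max_{(s,t)} X^2 + (1 + O(n^{-b}))\, \mathbb{E}_{\mathcal{V}_p}(X - \mu)^2\Bigr) \\
 &= \tilde{o}(1) \max_{(s,t)} X^2 + (1 + O(n^{-b})) \min_{\mu \in \mathbb{R}} \mathbb{E}_{\mathcal{V}_p}(X - \mu)^2 \\
 &= \tilde{o}(1) \max_{(s,t)} X^2 + (1 + O(n^{-b})) \operatorname{Var}_{\mathcal{V}_p}(X).
\end{align*}

For the third line we have used the obvious fact that the minimum in the first line occurs
somewhere in the interval $[\min X, \max X]$.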

The proof of Theorem 5(b) is the same. To prove Theorem 5(c), note that according
to Theorems 4, 11 and 12, there is an event $B \subseteq I_{m,n}$ such that $\operatorname{Prob}_{\mathcal{G}_k}(B) = \tilde{o}(1)$ and,
for $(s, t) \notin B$,

\[ \operatorname{Prob}_{\mathcal{G}_k}(S = s \wedge T = t) = \exp\bigl(O(n^{-b})\bigr) \binom{mn}{k}^{-2} \prod_{i=1}^{m} \binom{n}{s_i} \prod_{j=1}^{n} \binom{m}{t_j}, \]

which matches $\operatorname{Prob}_{\mathcal{B}_k}(S = s \wedge T = t)$ up to the error term. Similarly for Theorem 5(d).
Theorem 7 follows from a similar argument, on noting that the $\varepsilon$-regularity of $t$ implies

n 

- Am)2 < n2+2^ < m^"A(l - X)mn. 

i=i 

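Finally, we prove Corollary 6 for $\mathcal{D} = \mathcal{G}_p$, which is representative of the two cases. In
view of Theorem 5, it will suffice to prove that

\[ \operatorname{Prob}_{\mathcal{V}_p}(E) \le \tilde{o}(1) + o(1) \sqrt{\operatorname{Prob}_{\mathcal{B}_p}(E)} \tag{14} \]

if $\operatorname{Prob}_{\mathcal{B}_p}(E) \to 0$. Define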

y = maxjn^ V -logiFi oh b^{E)) - 1 loglog(ProbBp(E)) | 

and 

E = {{s,t) eE:\K — pmn\ < y^pqmn }. 

By a suitable normal approximation of the binomial distribution, such as [26] Thm. 3], 
Probg^(E \ ^) = 0{e~-y'/'^/y), so by Theorem U Probv,(S \ ^) = 5(1) + 0(e~s^'/Vy). 
Also note that 

Kp{p')dp' = 0{e-y'"/y). 



Ik 



\p' —p\>y^Jpq/2mn 
Therefore, since $V(p)^{-1} = 1 + o(1)$ by Lemma 2,

Probv,(^) = 5(1) + 0{e-y"/yy) + [ Kp{p') ProbB^,(E) dp. 

J \p' —p\<y -^J pq /2mn 

Now note that, by ( fT3l) . for \p' — p\ < pql^mn and \k — pmn\ < y^pqmn we have 

Probg^,(5= s^T=t) 
ProbB^(5 = sA T=t) 

< exp ( ^At^PH^^p' -p)-!!^ip'- pf + Oin-^/^)) 

\ pq pq J 

< (l + 0(n-i/2))e2'V2 

and so 

Viohrs^,{E) < {l + 0{n~''^))ey"'^VTohB,{E). 
Since J Kp{p') dp < 1, we have proved that 

Probv,(^) < 5(1) + 0{e-y"/yy) + (1 + o(l))e^'/2 ProbB^(^), 

which gives ( 11^ when the value of y is substituted. 

4 Concluding remarks 

A theorem similar to Theorem 4 holds also in the sparse domain. This was shown by
Greenhill, McKay and Wang in the case that $(\max_i s_i)(\max_j t_j) = o\bigl((\sum_i s_i)^{2/3}\bigr)$ [10].
That theorem can probably be used to develop a similar theory of degree sequences in
that domain. However, the lack of a precise enumeration in the gap between the sparse
domain and the dense domain of Theorem 4 currently thwarts a theory which spans both
the sparse and dense domains.